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Abstract 

Background: When mathematical modelling is applied to many different application areas, a common task is the 
estimation of states and parameters based on measurements. With this kind of inference making, uncertainties in the 
time when the measurements have been taken are often neglected, but especially in applications taken from the life 
sciences, this kind of errors can considerably influence the estimation results. As an example in the context of 
personalized medicine, the model-based assessment of the effectiveness of drugs is becoming to play an important 
role. Systems biology may help here by providing good pharmacokinetic and pharmacodynamic (PK/PD) models. 
Inference on these systems based on data gained from clinical studies with several patient groups becomes a major 
challenge. Particle filters are a promising approach to tackle these difficulties but are by itself not ready to handle 
uncertainties in measurement times. 

Results: In this article, we describe a variant of the standard particle filter (PF) algorithm which allows state and 
parameter estimation with the inclusion of measurement time uncertainties (MTU). The modified particle filter, which 
we call MTU-PF, also allows the application of an adaptive stepsize choice in the time-continuous case to avoid 
degeneracy problems. The modification is based on the model assumption of uncertain measurement times. While 
the assumption of randomness in the measurements themselves is common, the corresponding measurement times 
are generally taken as deterministic and exactly known. Especially in cases where the data are gained from 
measurements on blood or tissue samples, a relatively high uncertainty in the true measurement time seems to be a 
natural assumption. Our method is appropriate in cases where relatively few data are used from a relatively large 
number of groups or individuals, which introduce mixed effects in the model. This is a typical setting of clinical 
studies. We demonstrate the method on a small artificial example and apply it to a mixed effects model of 
plasma-leucine kinetics with data from a clinical study which included 34 patients. 

Conclusions: Comparisons of our MTU-PF with the standard PF and with an alternative Maximum Likelihood 
estimation method on the small artificial example clearly show that the MTU-PF obtains better estimations. 
Considering the application to the data from the clinical study, the MTU-PF shows a similar performance with respect 
to the quality of estimated parameters compared with the standard particle filter, but besides that, the MTU algorithm 
shows to be less prone to degeneration than the standard particle filter. 

Keywords: Particle filter, Sequential Monte Carlo methods, Nonlinear filtering, Parameter estimation, Measurement 
time uncertainties, PK/PD, Mixed effects, Leucine kinetics 
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Background 

Measurement time uncertainties 

Uncertainty in the time at which a measurement is taken 
is an often neglected source of random error. While in 
many application areas, this kind of error is generally 
small and indeed neglectable (due to automated measure- 
ments and precise timings), in others it may be of real 
influence, especially in the life sciences. As a prominent 
example, one may consider pharmacokinetic and pharma- 
codynamic (PK/PD) models which are used to describe 
the metabolic interactions and the effects of a chemical 
agent (like a drug or a labelled substance) over time inside 
an organism, respectively. 

A typical population experiment in the PK/PD con- 
text consists in the analysis of the contents of the blood 
plasma of several individuals with respect to concentra- 
tions of certain molecules of interest. For this purpose, 
blood probes have to be taken from each individual at cer- 
tain (fixed) time points after a certain event has occurred 
(e.g. a drug or a labelled substance has been applied). It 
is clear from the setting of the experiments that there is 
some variation in the real point in time when the blood 
probe has been taken: the true time when the measure- 
ment value has been obtained might be shortly before 
or after the intended time, and this true measurement 
time is not known to us. Since the inclusion of those 
time uncertainties in the model usually makes the analysis 
more difficult, it is standard to lump the time uncertain- 
ties with the measurement error. But especially at early 
times when concentrations change quickly, this may easily 
lead to wrong estimations, even if one assumes very high 
variances of the measurement error (we will demonstrate 
this later on a simple example). On the other hand, the 
inclusion of measurement time uncertainties (MTU) in 
algorithms aiming at inference making in complex mod- 
els is not straightforward. In this article, we will present 
a modification of the Particle Filter (PF) algorithm (which 
we call MTU-PF) which is able to fully include a statistical 
model of the time uncertainties. 

Inference in complex systems 

The assessment of the effectiveness of a drug in a clinical 
study has been done in the past by the direct computa- 
tion of relatively simple statistical values. The enormous 
increase in complexity of the underlying models, due 
to present developments in medicine and biology, for 
instance in the areas of personalized medicine or systems 
biology, increases also the need for more sophisticated 
model-based inference methods. 

The estimation of unobservable internal variables or 
model parameters from data which have been obtained 
from blood or tissue samples at several time points can 
reveal information on the concentrations and effective- 
ness of the substance under question. If these data come 



from individuals which belong to two different (or even 
more) groups, e.g. test and control group, mixed effects 
are introduced in the underlying models. The inherent 
non-linearity and high variability of biological processes 
adds considerably to the difficulties one faces during the 
inference step. Inference in connection with dynamic 
models plays a major role in many other application areas. 
State and parameter estimation as well as model discrimi- 
nation and validation are most common, but also optimal 
control problems should be mentioned. 

It is often not enough to consider (independent) mea- 
surement noise [1]. Correlations between residuals are not 
uncommon, and the violation of this statistical assump- 
tion may lead to wrong estimates. A natural way to 
include correlated noise is to model two different types 
of noise: the dynamic (process or system) noise which is 
present in the dynamics of the system states and origi- 
nates either from true random fluctuations in the system 
or from unmodelled dynamics in the system, and the mea- 
surement noise which is introduced by the measurement 
procedure or equipment and modelled by independent 
residuals. One possible approach is to use state space 
models which consist of a time-continuous model for 
the system states, e.g. based on Stochastic Differential 
Equations (SDEs), and a separate model for the time- 
discrete measurements. 

Parameter estimation with Maximum Likelihood approach 

Parameter estimation in state space systems is a diffi- 
cult problem. In a context where the system dynamics 
are modelled by Ordinary Differential Equations (ODEs) 
without correlated noise, the problem is most often 
considered as a (deterministic) optimization problem 
based on a Maximum Likelihood (ML) formulation. An 
overview of these approaches can be found in [2] and [3]; 
see also [4], which consider also other aspects like iden- 
tifiability. A generalization of the ML approach including 
more flexible cost functions is given by the prediction 
error estimation method ([5]). The introduction of system 
noise in the state variables leads to optimization prob- 
lems with SDE constraints. In this case, internal system 
states which cannot be directly observed need to be esti- 
mated jointly with the parameters, given the data. For 
this purpose, the parameter estimation methods must 
be augmented by appropriate state filtering methods. An 
overview of ML parameter estimation in these types of 
models is given in [6]. If the SDEs are non-linear, lineariza- 
tions to the Kalman Filter, like the Extended Kalman Filter 
(EKF) or the Unscented Kalman Filter (UKF), are used to 
establish approximations to the means and covariances of 
the filter distributions over time. All those approximations 
suffer from the fact that they approximate the filtering dis- 
tributions of the states by a Gaussian distribution at all 
time points and cannot adequately approximate skewed 
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or multimodal distributions. Better approximations are 
provided by simulation based methods like Sequential 
Monte Carlo (SMC) algorithms where good conver- 
gence results have been established ([7]). Nevertheless, 
they suffer from several drawbacks when applied to the 
joint estimation of dynamic states and fixed parameters 
([8-10], see also [11]). 

Parameter estimation in a Bayesian context 

In a Bayesian context, in contrast to the "classical" ML 
approach, a prior distribution is assigned to the parame- 
ter vector, hence the parameters can be treated as random 
variables. In this sense, parameter estimation is done by 
evaluating the so-called posterior distribution which can 
be computed (at least theoretically) by Bayes' theorem 
given the observations (measurements) and the prior dis- 
tribution. In the context of high-dimensional spaces, this 
requires the computation of high-dimensional integrals 
which is not possible to do analytically. For this purpose, 
Markov Chain Monte Carlo (MCMC) methods provide 
powerful tools for the computation of simulation-based 
approximations to the posterior distribution. Again, in 
the context of the joint estimation of dynamic states and 
fixed parameters, the design of good proposal densities is 
a very difficult problem which renders the use of standard 
MCMC methods like the Metropolis-Hastings sampler 
impractical for the purposes of parameter estimation in 
state space systems. 

It has long been a wish to combine both (dynamic) 
SMC and (static) MCMC methods to provide a general 
tool for the joint estimation of dynamic states and static 
parameters. Only recently, Andrieu et al. [11] proposed a 
very promising combination of both types of Monte Carlo 
approaches called Particle Markov Chain Monte Carlo 
(PMCMC) which is generally applicable and where also 
convergence has been proved. 

In the present article, even though the PMCMC 
approach might be the preferred method for parameter 
estimation in state space systems, we will concentrate 
solely on the SMC methods, since our modification affects 
only this part. However, to be able to do parameter esti- 
mation in a pure SMC context, we rely on an approach 
that is very often used to avoid problems with the esti- 
mation of constant parameters. This approach consists 
in the introduction of artificial dynamics in the param- 
eters, that means the parameters are allowed to slightly 
change their values over time. In this way, and in a 
Bayesian context, the parameters can be treated exactly 
in the same way as the system states. After building an 
augmented system state by concatenating the parameter 
vector and the state vector, the joint estimation of states 
and parameters reduces to filtering of the augmented state 
vector which makes SMC methods directly applicable to 
the problem. 



Particle filters for state and parameter estimation 

Particle niters ([12-14]) belong to the class of SMC meth- 
ods for state filtering in state space models. Using the 
state augmentation approach, the method is also capable 
of estimating system parameters. The standard particle fil- 
ter is designed for discrete, non-linear, and non-Gaussian 
models and can routinely be adapted to the continu- 
ous case with measurements at discrete times. The idea 
of the particle filter is that, at each time point, there is 
a sample based representation (the weighted particles) 
of the current estimate of the inner states and parame- 
ters which is based on the measurements that have been 
obtained up to the current time point. The particle cloud 
is then propagated through time, and the particles and 
weights are updated accordingly at each time point where 
measurements are available. 

Non-Linear Mixed Effects models 

Estimation in a Non-linear Mixed Effects model (NLME) 
involves the estimation of both global and individual 
parameters. With classical maximum likelihood estima- 
tion, the individual parameters are random variables 
equipped with a distribution while the global parameters 
remain constants with a "true" but unknown value. If the 
underlying model equations are non-linear, this leads to 
likelihood functions which are not analytically accessible 
and one has to rely on approximations. In the context 
where the system dynamics are modelled by ODEs, the 
most popular algorithm for NLME parameter estimation 
in the PK/PD context is the tool NONMEM ([15]). In 
[1] an estimation algorithm for NLME models based on 
Stochastic Differential Equations (SDEs) was proposed 
that uses the First-Order Conditional Estimation (FOCE) 
method to approximate the likelihood in combination 
with the EKF estimation in the SDEs. This has been added 
to NONMEM ([16]). In [17], a comparison between ODE 
and SDE based parameter estimation has been performed 
which showed that the interindividual variabilities were 
in general estimated to be smaller for the SDE model. 
Donnet and Samson ([18]) proposed a stochastic version 
of the Expectation-Maximization (SAEM) algorithm (for 
the estimation of the global parameters) in combination 
with MCMC methods (for the estimation of states and 
individual parameters). However, since MCMC exhibits 
slow mixing properties in the context of the estima- 
tion of states and parameters in state space models, in 
[19] MCMC has been replaced by the more promising 
PMCMC approach of Andrieu et al. ([11]). 

On the other hand, in a Bayesian context, also the global 
parameters are equipped with a (prior) probability dis- 
tribution, and the conceptual difference between global 
and individual parameters vanishes. The mixed effects 
model can then be considered as a hierarchical model with 
dependent parameters ([20,21], see also [22] for a more 
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recent population-based Bayesian approach to PK/PD 
modelling). Simulation-based (Monte Carlo) methods can 
easily be adapted to this case. Nevertheless, the above 
mentioned challenges to both SMC and MCMC methods 
are even higher due to the increased number of states and 
parameters in NLME models (the number of states and 
individual parameters has to be multiplied by the number 
of individuals). 

Aim of the article 

Our goal is two-fold: Firstly, we want to show that the par- 
ticle filter algorithm is applicable (with our modifications) 
also to more complex models when time uncertainties are 
formulated explicitly. Secondly, we want to show that the 
modification may even provide the possibility for further 
enhancement of the performance of the algorithm by pre- 
senting an adaptive time-stepping scheme which is only 
possible in the context of the new algorithm. 

We do not claim that our MTU algorithm generally per- 
forms better or worse than the standard filter, nor that 
it should be the preferred method for estimation in non- 
linear mixed effects models. Rather, we provide a method 
which is usable for models where time uncertainties may 
play a major role. In these cases, it may indeed lead to bet- 
ter estimations. On the other hand, our method transfers 
the time-discrete particle filter approach, where updates 
based on the measurements very strictly depend on the 
measurement times, to a truly time-continuous approach, 
where updates to the filtering distributions can be per- 
formed at every point on the time-scale. Since we want to 
focus on the time uncertainties, we neglect discussing fur- 
ther issues like identifiability, model evaluation and model 
discrimination. In our application to the model of plasma- 
leucine kinetics, we try to avoid these issues by providing 
ad-hoc values to some of the parameters (especially to the 
variances of the system states). 

Motivating example 

Let us have a look at an example for illustrating the ben- 
efits of a separate modelling of measurement time uncer- 
tainties. Let us consider a state space system given by the 
ODE 

dq(t) = (-aq(t) + p) dt 

with parameters a, /3 e R. Here q(t) e R denotes the 
state of the system at time t. We call the state trajecto- 
ries obtained by this deterministic system the nominal 
evolutions of the states. We add noise to the system in a 
standard way by introducing an additional term a dWt, 
with a standard Wiener process (Wt) t eR^° an d a diffu- 
sion parameter a e R. This leads to the following SDE 
describing the evolution of the state q over time: 

dq{t) = (-aq(t) + p) dt + o dW*. 



Furthermore, let the initial state #(0) be given by a log- 
normal distribution with parameters fi qo and a 2 Q (mean 
and variance of the logarithm of #(0), respectively). The 
parameters chosen in our implementation of this example 
are shown in Table 1. 

We assume that M measurements of the state q(t) will 
be taken at times tj and that each measurement y, j = 
1,...,M is disturbed by normal noise with mean q{tj) 
and with a fixed variance o^, i.e. measurement / is dis- 
tributed according X.ojj ~ J\f(q(tj), a 2 ). Usually, the times 
tj are assumed to be known. In contrast, we will assume 
that in addition to the measurement value error, there is 
some uncertainty about the exact times where the mea- 
surements have been taken. If we attempt to take the ;th 
measurement at the intended (or nominal) time ij, the 
measurement in fact takes place at time tj which may be 
shortly before or after the intended time tj. A natural way 
to model these uncertainties is to assume that the mea- 
surement time tj is given as a realization of a random 
variable Tj. In our example, we assume that Tj follows a 
truncated normal distribution given by the density 



Yjitj) := 



exp 
0 



(tj - tj) 2 
0.3 2 



if max{0, tj — 1} 
< tj < tj + 1, 
otherwise 



with normalizing constant 



■v-r- 

Jmax{0,tj-i} \ v.3 



and given intended measurement times tj. Figure 1 shows 
the different distributions for one measurement j for all 
possible intended times tj. In each of the subfigures (a)- 
(d), the shaded green area gives an impression of the 
"density" of the distribution of the measurement value jj 
in dependence of the intended measurement time tj on 
the #-axis, while the dark-green dashed line depicts the 
nominal evolution of the state q over time. Subfigure (a) 
shows the distribution of the measurement values with 
time uncertainties, while (b)-(d) depict the distribution of 
the measurement values with known measurement times 
(in this case tj = tj), for several standard deviations o y : in 



Table 1 Parameters for the motivating example 


true value of or 


1 


true value of /3 


3 


o 


0.05 




log(l) 




0.1 


Oy 


0.005 


distribution of 7} 


jsf(tj, 0.3 2 ) truncated at tj ± 1 and at t 0 
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Figure 1 Assumed measurement distributions for the motivating example. Measurement distribution resulting from (a) separate modelling of 


measurement time uncertainties and measurement value uncertainties, with o y 




0.005, and (b-d) lumped time-and-value uncertainties with 


several different assumed lumped measurement variances oy. The dashed dark-green line depicts the nominal evolution of the state q over time. 


The green shaded area depicts the region where the measurements are expected 





(b), the original standard deviation is used, while in (c) and 
(d), higher standard deviations are used which correspond 
to the cases with lumped value and time variations. 

Comparing Figures 1(a) and l(b-d), we observe that the 
distributions of the measurements exhibit clearly differ- 
ent shapes. For the "true" model depicted in Figure 1(a), 
if we consider a single point in time that lies in a time 
segment where the state values change quickly, the distri- 
bution of the measurement at this certain point in time is 
quite broad. The variance in the measured value is very 
high, whereas it is small in time segments where the state 
values change slowly. In contrast, for the standard parti- 
cle filter, the measurement variance is constant and hence 
the assumed measurement distributions differ remarkably 
from the "true" distributions, howsoever the value of Oy is 
chosen. It must be expected that this leads to difficulties 



when inference on the states and parameters needs to be 
done based on these models. We will resume our exam- 
ple after having presented the MTU particle filter and will 
show that this is indeed the case. 

Methods 

We divide this section into three subsections. In the 
first subsection, we fix the state and observation model 
we want to consider. In the second subsection entitled 
"Standard case" we outline the standard particle filter 
algorithm in the context of time-continuous states with 
time-discrete measurements, and the various probability 
distributions involved. Although nothing is new in this 
subsection, it serves several purposes. Firstly, the time- 
continuous case is relatively rarely considered in the liter- 
ature; secondly, the derivation of our modification needs 
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a slightly more general formulation than it is standard 
for the discrete-time filter; and lastly, the comparison of 
our modified version with the standard case might more 
clearly reveal the differences between the two approaches. 
In the third subsection entitled "MTU particle filter" we 
present our new modification of the particle filter. In the 
following section "Results and Discussion" we compare 
the new MTU particle filter to the standard particle fil- 
ter and to an alternative Maximum Likelihood estimation 
method on a simple artificial example. We also present 
an application of our MTU-PF method to a PK/PD study 
in a non-linear mixed-effects setting in direct comparison 
with the standard particle filter. 

Note: a list of all used symbols with a short explanation 
can be found at the end of this paper. 

The model 
State process 

Let (Q, A, P) be a probability space and for each t e 
[ to, oo) with to e R let (X t , Bx t ) be an arbitrary measur- 
able space. For each t e[to, oo) let further X t : Q —> X t be 
an A-Bx t measurable random variable such that X^oo) 
:= (Xt)te[to,oo) is a continuous- time Markov process with 
general state space 

<*woo) : = n 

t 0 <S 

For each t G [ to, oo), denote by Cx t the pushforward 
measure of P under X t , i.e. C Xt (B) : = P(X~ l (B)) for 
all B e Bx v Further, denote by £x [tQ>oo) the pushfor- 
ward measure of P under X[ tQf00 ) : = (X s ) se [ t0)OO ) (with the 
corresponding product algebra). Analogously, denote by 

X [toA : = I I X s for each t > t 0 

t 0 <s<t 

the state space restricted to the interval [ to, t], and denote 
by £x [t0tt ] the corresponding pushforward measure. For 
each s and t with t > s > to, let K s j(x s , dx t ) be the Markov 
kernel of the process X[ tQ)OQ ^ from time s to time t. 

An important special case for X[ t0iOO ) is given by a mul- 
tidimensional Ito process on X t = M. n (equipped with 
the corresponding Borel a -algebra) denned through a 
stochastic differential equation (SDE) 

dX t = a(X t , t) dt + B(X t , t) dW t 

with drift a(x, t), diffusion matrix B(x,t), multidimen- 
sional standard Wiener process Wt, and initial variable 
X to . In this case, it is possible to sample directly (at 
least approximately) from the kernels K S)t when a suit- 
able discretization method is applied, for instance the 
Euler-Maruyama method. 



Observations / measurements 

Let the process X[ tQ)OQ ^ be observed via M random vari- 
ables Y\ : m with values in measurable spaces (yj,By.). 
Each single observation (measurement) Yj depends on the 
state variable X tj at some time tj and on the observation 
time (measurement time) tj itself. We assume that, given 
the observation time tj and the state X tj = x t ., the variable 
Yj is independent of all other variables, and the condi- 
tional measure can be expressed via some conditional 
probability density g } (yj \ x tj , tj) with respect to a reference 
measure fiy. on (3^,6^). We do not require any further 
restrictions on g such as linear dependence on the states 
or Gaussianity. 

Observation / measurement times 

The observation times (measurement times) tj for ; = 
1, . . . , M are usually assumed to be deterministically given 
and known. Our variant of the particle filter will be based 
on the assumption that the observation times tj are them- 
selves realizations of random variables Tj. These variables 
model the uncertainty about exact observation times. In 
contrast to the observation variables Yj, the observation 
times Tj are never observed (measured). We assume that 
all information available to us is their probability distri- 
bution on the half axis [ to, oo), while in the case of the 
observations Yj, we know both the densities gj(jj \x tj ,tj) 
and the observed values yj. 

In this article, we will only consider the simplest case 
where each variable Tj is independent of all others. 
Dependencies between the T/s, especially concerning the 
order of the observation times, may be considered natural 
but would lead to more complicated algorithms. How- 
ever, order dependencies can easily be introduced via 
restrictions on the support of the variables. In general, the 
probability distribution of every single variable Tj shall 
be given by a density Yjitj) with respect to the Lebesgue 
measure h[t 0 ,oc) on the interval [ to, oo). 

In the following, we will consider the two cases men- 
tioned, where either all tj are deterministic and known 
or all tj are random and unknown. Note that the first 
case formally coincides with the case that tj is random 
but observed. We will therefore stick to the notation 
gj(yj | Xtj, tj) for the observation densities in both cases. 

Standard case: measurement times deterministic and 
known 

We will first consider the standard case, where the obser- 
vation times tj are known. For simplicity, we assume 
here that the observation times t\ : M are strictly ordered 
increasingly, i.e. to < t\ < • • • < tM< 

The standard case of the particle filter is usually for- 
mulated for discrete-time Markov processes X tQ . M : = 
(Xtj)je{o,...,M} w ith general state space where the state vari- 
ables are only denned at the initial time to and at the times 
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t\, . . . , tM when measurements occur. Nevertheless, this 
case is included in our more general framework where X t 
is defined for all t > to. One just focuses on the state 
variables for those times only. In view of the later general- 
ization to random observation times, we will consider the 
fixed values tj as realizations of random variables 7) and 
condition all occurring densities on them. As mentioned 
above this assumption leads to the same results as if we 
assumed the values tj to be given deterministically. 

Full model and filter model 

The full model is given by the joint density of the vari- 
ables X tQ:M and Y\ : m (conditioned on the observation 
times Ti : M = ^lm) with respect to the product measure 

M 

f X t0:M ,Y hM | Tv.M {XtQMiyi:M | tvM) := Y[ gj (yj | Xtp tj ). 

7=1 

(1) 

The filter at a given time is based on a reduced model. 
This model is given by the joint density of the variables 
X to . k and Yix (conditioned on T\ : m = ^lm) with respect 
to the product measure £x tQk Ylj=i l^yf- 

k 

f X tQ:k ,Y 1:k | Ti:M (Xtok)yhk i tvM) Y[ gj(y . i ^ tj y (2) 

7=1 

This density is based on the state sequence X to . k . In con- 
trast, we can focus on the single state X t]< by considering 
the joint density of the variables X tk and (given T\-m = 

h-.M) with respect to Cx tk HI/Li Vyy It can be computed by 
marginalization as follows: 

f Xt *' YhklThM (Xt k ,yi:k\h:M) 



X dCx tO:k (3) 

and the filter density at time with respect to Cx tk can 
then be computed with Bayes' theorem: 

.f Xt *' Yl:klTl:M (Xt k ,yi:k\h:M) 



f Xt ^ YhhTl:M (Xt k \yi:hh:M) := 



f YhklThM (yi:k\tl:M) 



(4) 



with 

f YhklThM (yi:k\h:M) 

f X tQ:k ,Y 1:k | T, M ^ yhk ! h:M) d£ ^ 



fx** 



the particle filter computes a Monte Carlo approxima- 
tion using the fact that the filter densities/^ ' Yl - k,Tl:M can 
be computed recursively. This is done in two steps. We 
consider the filter distribution at time given by the 
probabilities 

P (Xt^ e B | Yix-i = y 1:k - h T hM = t hM ) 

= f f x ^ lY ^> T ™ {x tk _ x \yv.k-i,t hM ) dC Xtk i (x tk _ x ) 



(6) 



for each set B e Bx tk x . We then get first the predic- 
tion distribution, i.e. the distribution of X t]< given the data 
(and T hM )> by use of the kernel K tk _ lt t k : 

?(X tk e B | Yix-i = yi:k-v Ti :M = t hM ) 

= 11 f Xt ^ 1 Yhk - 1,ThM I yi*-i, h: M ) 
Jb Jx tk _ x 

x dCx^ {x tk _ v dx tk ) 

(7) 

for each set B e Bx tk ♦ Then we use Bayes' theorem to get 
the filter distribution at time t^: 

P(X tk e B | Y 1:k = y 1:h T 1:M = t hM ) 

gk(yk\*t k ,tk) 



-L 



B fYk I Yhk-l,Ti-.M (y k \y hk _ 1 , ti-M) 



(8) 



(9) 



(5) 

For general (non-linear) models, the practical computa- 
tion of the filter density is very difficult. Nevertheless, 



f /^- 1 '^- 1 ' ri:M (%_ 1 l^-l^l:M) 
J *tk-l 

X d£ ^_: ( X t k -1> dx t k ) 

for each set B e Bx tk > with normalizing constant 

fY k I Y 1:k _ v T 1:M (y k | y hk _ lf tl:M ) 

-= / gk(yk\xt k ,tk> 

Jx tk 

J *tk-l 

x dCx^ KJ^-iA ( x t k -v dxt k ) . 



Importance sampling 

Another ingredient for the particle filter is sequential 
importance sampling. We assume that a second Markov 
chain X tQ:M on the same state space is given with push- 
forward measures jC^ and kernels Ktj-i,tj (xt h i> dxtj) for 
j = 1, ...,M. We assume that for each x t ._ x e X t ._ v 
the measure Ktj_ lt tj (%tj-i> •) is absolutely continuous with 
respect to the measure K tj _ l)t) ■,(x tj _ l , •). It follows that 
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the Radon-Nikodym derivative (written as conditional 
density) 



Kt hl ,tj(xt hl , dx tj ) 



Kt hl ,tj(xt hl , dx tj ) 

exists. We further assume that the pushforward measure 
Cx tQ under X to is absolutely continuous with respect to the 
corresponding pushforward measure under X tQ with 
Radon-Nikodym derivative 

d ^x tQ (x t0 ) 



Qt 0 (x to ) := 



For sequential importance sampling, we need to be able to 
sample from the initial measure C^ t and from the kernels 

for each x t ._ x e X t ._ v and to compute Qt 0 (xt 0 ) as well as 
Qtj | t h i (x tj I x thl ) pointwise. 
Using 

K tk _ lt t k (x tk _ v dx tk ) = Qt k \t k _ x {x tk \x tk _^)k tk _ litk (x tk _ v dx tk ), 



we can then write the recursive formula (8) for the filter 
distribution at time t k as 

P(X tk e B | Y hk = y hh T hM = t hM ) 

gk(yk\xt k >t k ) 



-L 
L 



Bf YklYhk - l > TlM (yk\yi:k-l>tl:M) 

A- 1 lYhk - l > ThM 



Kt k _ lt t k (xt k _ v dx tk ) 



(10) 



for each B e Bx tk . The direct computation of the normal- 
izing constants f Yk 1 Yhk - l ' ThM (yk\yi:k-i> t\. M ) (while y hM 
is assumed to be fixed) is not necessary. Sequential impor- 
tance sampling is performed as follows. Draw a number N 
of realizations x\ Q from C^ t and compute the correspond- 
ing unnormalized weights 

w t 0 : = Qto (4 0 ) for all / = 1, ... , N. 

Then, for all k = 1, . . . ,M, sample realizations ^ from 

the kernel Kt k _ lf t k (x l tkl , dx tk ) for each i = 1, . . . ,iV and 
compute the unnormalized weights 

< : = Qt k | (4t I *Li) 1 *V '*) "Li for a11 

For suitable integrable functions h (e.g. fulfilling some 
mild restrictions on how fast h may increase with x, see 
[23] for details), the expectation of h with respect to the 



filter density conditioned on the observations Y\ :k = y\ :k , 
given by 

E [h (X tk ) | Y 1:k = y 1:h T hM = t hM ] 

= J /** I Y i*> T w (x tk 1 j/ 1:/o t 1:M )h(x tk ) dC Xtk (x tk ) , 

(11) 

can then be approximated by 



^t k ,N [h (X tk ) I Y hk = y hk , T hM = t hM ] := 



£f =1 < 



(12) 



where N is the number of particles. In fact, it can be shown 
that as N approaches infinity, these empirical expectations 
converge to the filter expectations: 

lim E t N [h(X t ) | Y 1:k = y 1:k , T hM = h :M ] 

= E [h (X tk ) | Y 1:k = y 1:k , T hM = h :M ] • 

Note that if we can sample from the Markov kernels 
of Xtj, we can choose X t . = X tj (at least in law), whence 
Q tQ (x to ) = 1 and Qtj | tj-i (x tj \ x tj _^) = 1. This is a standard 
choice, but in terms of efficiency of the particle filter algo- 
rithm not always the best one. On the other hand, finding 
a suitable Markov chain X tQ:M different from X to . M is not an 
easy task. 

Resampling 

If the number N of samples through time is fixed, the sam- 
ples obtained by sequential importance sampling quickly 
degenerate since most of the normalized weights decrease 
rapidly towards 0. The degree of degeneracy is often mea- 
sured by an estimate of the so-called effective sample size 
(ESS). This estimate at time t is given by 



^ESS 



where 



Ef =1 W 



£5U < 



(14) 



(15) 



are the normalized weights. It obtains its maximal value 
N if all weights are equal, and it approaches 1 if the vari- 
ance of the weights and thus the degree of degeneracy 
increases. To avoid this degeneration of the samples, a 
resampling step needs to be done when the ESS drops 
below a threshold A/xhreshold (which is usually chosen to be 
N/2). 

Resampling at some time si is based on given non- 
negative (unnormalized) selection weights v l S£ for each 
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particle index /: One repeatedly selects particles with 
probabilities p l t given by the normalized selection weights 



pi ■■= 



L^v—l V S£ 



(16) 



This is multinomial resampling. There exist procedures 
where each single particle is still selected with probability 
p\, but with reduced overall variance, for instance strati- 
fied resampling or systematic resampling which should be 
preferred [24,25]. In any case, resampling defines a selec- 
tion function t£ : I -> / on the index set / : = {1, . . . , N}. 
Resampling is then done by replacing the state samples 

fe) _i m by tne se l ecte d state samples (x L s £ S^ ) 

it l—L,...,N A \ / i—l,...,N 

Since before selection the probability that the particle i 
will be chosen is p\ for each draw, the expected number 
of times that particle i has been chosen after N draws is 
Np L ^ l \ To correct for the introduced bias, the normalized 
weight w l H for each selected particle i needs then to be 
corrected by replacing it by the weight 



(17) 



(using (16)). The necessary correction is therefore 
achieved if the unnormalized weights (w l s )i=i tmmmt M 
are replaced by the corrected unnormalized weights 
\w Se /v S£ )i=i,...jj. 

Note that in the original particle filter, the selection 
weights v l S£ at time si are chosen to be the particle weights 
(before the replacement), i.e. 

v ii =<i for/ = 1,...,7V, 

such that after the resampling step the unnormalized 
weights are all equal to 1. Nevertheless, in general their 
choice is free and may be based on the observations 
(which is used in the so-called auxiliary particle filter [26]). 

Particle filter algorithm 

The particle filter computes the state realizations and 
weights recursively through time. In its standard form, 
the particle filter can be stated as in algorithm 1. Note 
that if one chooses X[ tQ)OQ ) = X[ t0fOO ) (in law), then 

Qtk\tk-i( x t k \ x t k -i) = 1 an d tne update of the weights 
simplifies to 



Algorithm 1 Standard particle filter 



{At time to:} 

Sample N state realizations (x l tQ ) ._ 1 N of X tQ , with N large, 
for i = 1, . . . , N do 

Set the weight w l tQ = Q to (x l tQ ). 

end for 

for all times tj <} k = 1, . . . , M do 

{Resample the particle set (x l tk _^j ^ ^ if necessary (e.g. if the ESS drops below a threshold):} 

Generate a selection function i according to some selection weights \ v\ ) 

V k l /i—l,...,N 

for i = l,...,Ndo 



Replace the state sample x\ k l by the selected state sample x L ^_ x . 
Replace the unnormalized weight w\ k _ x by the corrected weight m/^^/v^^. 

end for 

for i — 1, . . . ,N do 

Sample a realization x\ from the Markov kernel 



15: Update the weight 

< = Qt k i tk _ x (4 k I *Li) & (yk 1 4 k > t k ) w\ k l . 

16: end for 

17: For given suitable integrable functions /z, compute estimates 

T!Li < k h(4 k ) 

Et k ,N[h(Xt k ) I Y hk = y 1:k , Tim = h-.M\ '= — — tt ; ♦ 

Ei=i < 

18: end for 
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Data Likelihood 

Model validation or discrimination is generally based on 
the data likelihood 



7 ■ 



Zt k {tvM)-=f l * ]TlM iyi:k\tvM) 

f X t0 ,,Y, k I T, M ( Wi k , ACx ^ 

= E[f x '°*' Yhk}TlM (-,yi:k\hM)] 

(18) 

for given observations y\.fc. Without resampling, the data 
likelihood could be approximated by the empirical mean 
of the unnormalized weights, i.e. by 

1 N 

Z tk (t 1:M ):=-J2< k (19) 



i=l 



because this is the empirical estimate for the above expec- 
tation. After a resampling step, this is not valid any longer. 
Nevertheless, in any case, the data likelihood can be com- 
puted recursively by the following estimate of the ratio 

Zt k (tl'M)/Zt k -i(tl'M)* 

Z^m) _ ££i I fr-i H 1 ) gk ( yk 1 tk ) ^-i 



(20) 



with initial estimate Z tQ (t\ : M) = 1 (see e.g. [27]). 



MTU particle filter: Uncertain measurement times 

We now assume that each observation time tj is a realiza- 
tion of a random variable 7). Its distribution is expressed 
via densities yj with respect to the Lebesgue measure 
^[fo,oo)« The observation times tj themselves are not 
observed. 

Full model 

The full model in this case will include complete con- 
tinuous state paths, since the observation times are now 
distributed over the complete time axis [ to, oo), and the 
observations may potentially depend on every state %t j 
for tj <E[to,oo). Consider therefore the joint density of 
the variables X[ t0)OO ^, Y\ : m and Tim, with respect to the 
product measure C X[tQt00) Y\fii ^ X^L\ * 

M M 
f^YvMJvM (x [tQjOQ) ,y l:M , tl:M ) = Y\gj(yj I ^) Y[ Yj{tj\ 

j=l j=l 

(21) 

Filter model 

The filter at a given time t > to is again based on a reduced 
model. This model is given by the joint density of the 
following variables: X[ tQ j], denoting the state paths until 
time t; further only those variables Yj for which Tj < t; 



and finally T\ : m< This density is given with respect to the 
product measure C X{tQ>t] U^i I~[ 7 =i hto,°°) b Y : 



M 



M 



j^^.ti.n^ri^f^^^^ t ^ _ Y^gjiyj I x tr tj) Y[ Yjitj). 



7=1 

tj<t 



(22) 



Note that we cannot use the simple notation of the stan- 
dard case where for filtering only the first k observations 
are taken into consideration at time ^, since neither the 
observations are ordered in time nor the times tj are fixed 
in advance. For this reason we have to include all mea- 
surements Y\ : m also into the filter model. Note that even 
though we use the complete data Y\ : m = yi:M in the nota- 
tion, only those yj have to be known at time t for which 
tj < t holds. To avoid confusion, we mark all densities 
connected to the filter model at time t by a hat superscript 
(and by the index t). 

We will now derive formulas for the filter density. 
Since we assume that the observation times T\ : m are not 
observed, we use marginalization to get the joint density 
for X[ tQ>t ] and Y\ : m only, which is 

ft (X[to,t],yiM) 

= " j?^' ^ 1M (X[to,t]>yi:M, h-M) dt\ • • • dt M 

Jto Jto 

pOO p OO M M 

= I " Y\gj(yj I *t r tj) n Yjltj) d*i • • • dt M 
Jto Jto j =1 j =1 



(23) 



with respect to the product measure £>x [tQ t] Y\j=i ^yy We 
will further simplify this density. If we define 



gj,t(yj\x tj ,tj) := 



gj(yj\xtj,tj) \itj < t 
1 otherwise, 



then 

M M 

Y\gj (yj I x tp h) = Tim (yj i *tf> ^ 
j=i j=i 

tj<t 

and further 



ft (x[t<>,t],yiM) 



pOO p oo 

= / / X\gihi\Xt r tl)X\ Y j{tj)dt X 

Jt 0 Jto j =1 j =1 

tj<t 

poo poo M 

= •••/ n(swfoi*^)w<#) 

Jto Jto j =1 

M poo 

= ]1 / 1 X *J> tj)Y;(tj) dtj, 



• • dt A 



- dtM 



(24) 
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where the last step is possible because the factor indexed 
by j does not depend on ty for / ^ j. For each j, we can 
split the integration by tj at the time point t into two parts 
and get 

/ gj,t(yj\xt r tj)yj(tj) 

= / gj hj I xtj> tj) Yj(tj) + / Yj(tj) dtj 
Jt 0 Jt 

= 1 + f {gjiyj I *t;, tj) - 1) Yjitj) dtj 
Jt 0 

where the last step follows from the fact that yj is a 
probability density and therefore 

nt r OO r OO 

/ yj(tj) dtj + / yj(tj) dtj = / yj(tj) dtj = 1 
Jt 0 Jt Jt 0 

holds. Inserting this into (24), we get 

ft {X[to,t]>yi:M) 

= 11 i 1 + / I *;) - 1) d ';) • 

With a further marginalization, we get the joint density of 
X t and Y\ : m for the filter model, 



f t Xt ' YhM (Xt,yi:M) 



-k 



*[tQ,t] eX [tQ,t] 'Xt=xt 



^'^^(x^thyhM) dC X[t0tt] (X[t 0 ,t}) 



*[tQ,t] eX [tQ,t] 'Xt=xt\ 



n (* + j (gjty i **>> v - i ) vfa) dt^ dc X[toit] (x M ) 

(25) 

which is with respect to the product measure 
Cx t Y^jLi fly;* From this density, we finally can compute 
the filter density with respect to Cx t by applying Bayes' 
theorem: 



ft (x t I y hM ) = , Y — 

ft 1M (yi:M) 



(26) 



where 

ft YlM (yi:M) = f f t Xt ' YhM (Xt,yi:M) d£ Xt (x t ) (27) 

is the data likelihood with respect to the measure 
II/W 

Effective computation of the filter distributions 

In the following paragraph, we will show how the densities 
of the filter distributions given by (26) can be effectively 
computed. This is the basis for the formulation of our 
MTU particle filter method. 



Let the observations y\ : M e 
be given. For each time t G [ to, oo) and for each ; e 
{1, . . . ,M}, we define random variables 

Wj, t : ^ -> M>o and W t : Q -> M> 0 

by the following system of ODEs 

dW ht (co) = ^(yi |X,(o>),f) - 1) n(0 d£ (28) 

dWJvf^M = (smCm - 1) y M (t) dt 

for each co e Q with initial values 

W lttQ ((D) = ... = WM,t*((o) = l, (29) 
and by 

M 

Wt = Y\ W j,t> (30) 
7=1 

We will show that for each set A in the a -algebra gener- 
ated by the variable X t , it holds that 

/f ljl:M) d£ Xf (*f) = . (31) 

J n W t (co) d?(co) 

where ff l ' Yl:M is the filter density. That means we can 
use the processes Wj )t and W t to compute the filter dis- 
tributions through time. From this, it follows immediately 
that we can also compute filter expectations. Indeed, for 
any real-valued measurable function h on X such that 
E[|A(*t)|] < oo, it holds that the expectation of h(X t ) 
given Y\ : m = yv.M with respect to the filtered state X t 
defined by 

Uh{X t ) | Y 1:M =yi: M ] : = E^ tlYl . M( lyiM) [h(X t )] 

= [ j?' 1 YhM (x t | y 1:M )h(x t ) dC Xt (x t ), 

JXt 

(32) 



/ 

Jx t (A) 



is given by the following equation: 

Et [h(X t )\Y hM =yi: M ] = 



J Q W t (co)h(X t (co))dP(co) 



f Q W t {co) dP(a>) 



(33) 



To show our assertion, we consider the processes Wj }t 
for j = 1, . . . ,M. According to (28) and (29), each Wj )t is 
denned as 

W M = 1 + T (gjiyj I X tj , tj) - 1) Yjitj) dtj, (34) 
so, with (30), 

M M ( t 

Wt = n Wj, t = n i + / (gjiyjixtf.tj) - 1) 

7=1 7=1 V Jf o 

(35) 
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holds. Thus, to prove (31), we have to show that for each 
set A from the a -algebra generated byX t , 



R> 0 and w t : X [t0)t] (Q) 



l>o by setting for each X[ tQ)t ] e 



f A w t {co) dPM = ( r r 

In W t (a>) d?(co) \J Xt(A) J { X[tQ , t] eX [tQ , t] :~x t =x t ] 



X dC X (X [t0 , t] ) d£ Xt (Xt) J / f t YhM (yi:M) 



holds (see (25) and (26)). It is enough to show the equality 
for the numerator, i.e. 



W t (co)dP(co) 



■f I 

Jx t (A) Jl 



*[toA eX [to,t] ■ x t =xt\ 



/ ft 
7=1 V Jt ° 

x dc x M (x[t 0 ,t])dC Xt (xt) 



since the equality of the denominator follows then imme- 
diately from the special case A = Q and from the fact 
that 

fI lM iyv.M) =11. . 

Y\( 1 + £ (&(yj I ~* ti > - 1) Yjitj) dt^j 

x dC Xit 0 ,,] (*[*.»]) 

Using the variable transformation {X[ to> i\(a>),X t (co)) = 
(%o,t]>*t)>weget 

/ W t (a,)dP(a») 

Jx t (A) J {x[t 0 ,t]£X[t 0 ,t} :x t =xt] 

jl (l + £ fety I */) - 1) W (fy) dfy) 
This is what we wanted to show. 

Since for each t and for each j = 1, . . . ,M the random 
variables W/^ and depend only on the process 
until time £, we can define functions wj )t : (£2) — »> 



: = Wy-,t(tw) and w t (^[^ >t ]) := W^(oj») 
for some go e Q with = X[ tQ)t y 

It follows from (34) that 
J to 

for each ; and from (35) that 



M 



Wt(x[to,t]) = Y\ w M( x lto,t])- 



(37) 



The values of will serve as weights in the MTU particle 
filter. We will call wj )t the partial weights. Since in each dis- 
cretization scheme which is applied to solve the integral in 
the formula (36) for wj >t the integrand has to be evaluated, 
we may run into practical problems if we use it as it is writ- 
ten in the formula. If the density gj(yj | xtp tj) evaluates to a 
very small value and if we work with fixed-precision num- 
bers, the subtraction of 1 will result in a value which may 
be practically equal to —1. If this error accumulates over 
time, we may end up with wrong values for Wj,t{X[t Q ,t])< To 
reduce the computational error, the integral could be split 
up in the following way: 

w jA x [t 0 ,t]) = 1 - f Yjifj) dtj + f gjiyj | x tp tj)yj(tj) dtj 
J to ho 

= 1 ~ Yj,t + Wj,t(x[t 0 ,t]) (38) 



(39) 



where the cumulative distribution function 

Jto 

is independent ofx[ tQ)t ] an d where the part 

Wj,t( x [t 0 ,t]) := f gj(yj\xtj>tj)Yj(tj)dtj (40) 
ho 

depends on the path X[ tQ) t] . It is even more convenient to 
compute j/j )t by evaluating the antiderivative of yj, if it is 
computationally available. 

Note that the definition of the filter distribution is 
dependent on the reference measure fiy. t A suitable 
change of this measure may help to further increase the 
efficiency of the algorithm. This issue still has to be 
explored. 

Resampling 

Special attention is needed for the computation of the 
weights after resampling steps have been applied. As men- 
tioned earlier, resampling at time si is done by randomly 
generating a selection function t£ : I — >► / (with index 



Krengel etal. BMC Systems Biology 201 3, 7:8 
http://www.biomedcentral.eom/1752-0509/7/8 



Page 13 of 33 



set / = {1, . . . , N}) based on given non-negative (unnor- 
malized) selection weights v l S£ for each particle index i. 
The state samples (^ Se )._ 1 N have then to be replaced by 

the selected state samples (x l s e ®^ ^, and the unnor- 

malized weights (w^).^ by the corrected weights 

(w l s\^ /Vs\^^ • We assume that resampling steps at 

times si,...,S£ with to < s\ < • • • < si < t have occurred, 
states and weights have been replaced at these times, 
and within this paragraph, we denote them by x s t 1, '" ,Sl ' 1 
and Wf 1 ''''^ (x^.'i]*' 1 ), respectively, for each particle i. By 
definition, we have at t = si 

^si,...,sr,i _ si,...,s £ -i;ii(i) 



and 



u M,...,Si ( si,...,Sf,i\ _ 
W H \ X [to:Si] ) ~ 



5l,...,5£_! / 5i,...,5 £ ;A 

s * \ X [to:se] ) 



For each time t > si and for each particle /, the corrected 
weights are then recursively given by 



5i,...,5 £ _i / Sl ,...,s e ;i\ / s 1 ,...,s i ;i\ 

V t \ X [t 0 :t] ) _ W t \ X [t 0 :t] ) 



nLi^ (0 

(41) 

with 

ik:i(0 = U o ••• oii(i). 
Since the process W t computes the uncorrected weights 

WtiXfo.'t]*'*)' we nave to correct the weights explicitly in 
the algorithm by dividing them by the cumulative product 

i? s: ' :-f\^; lh - 

Note that we have to select these products during a resam- 
pling step similarly to the selection of the states, i.e. at 
t = si, 

ySi,...,s V ,i _ -s\,-,si-l'M(i) L £ (i) 

The MTU particle filter algorithm 

A practical MTU particle filter can be obtained by any 
discretization scheme based on an (arbitrary) time dis- 
cretization to = ro < x\ < . . . < T£>. Similar to the standard 
particle filter, sampling need not necessarily be done from 
the state process X[ t0)OO ^ directly. One can instead sample 
from another process X[ t0iOO ), provided that the Radon- 
Nikodym derivatives 

dC XtQ (x t0 ) K s ,t(x s ,dx t ) 

Qto&to) := T7 — ; — r and Qt\s(xt\*s) '= -~— — 

d %feo) K s , t (x s ,dx t ) 

for each 5, t £[to, t] with s < t exist and can be evaluated 
pointwise. (In fact, it suffices that Q t \ s (xt\ %s) exists for all 



states x s e X s which are reachable from some initial state 
%tQ ^ Xt 0 with Qt 0 (xt 0 ) 7^ 0). In the special case that we 
sample from the Markov kernels of ^[£ 0 ,oo) directly, Qt\s = 
1 for all s < t. Our MTU particle filter is described in 
algorithm 2. Here we suppress the indices si, . . . ,S£ in the 
notations of states and weights (see last paragraph). 

The algorithm as it is written here has to be enriched 
with concrete discretization methods for the sampling 
and update steps. For instance, if the process ^[£ 0 ,oo) is a 
multidimensional Ito process defined through a stochastic 
differential equation (SDE) 

dX t = a(X u t) dt + B(X U t) dW t 

with drift a(x, t), diffusion matrix B(x,t), multidimen- 
sional standard Wiener process Wt, and initial variable 
Xt 0 , then the Euler-Maruyama method can be used for dis- 
cretization. We set Ar^ : = r^ — td-i- The sampling is then 
done by 



= a ( X Td-i> x d-i) A ?d + B (x l . 



Td-l> ' 



(42) 

where is a. sample from a standard normal distribution 
(with mean 0 and variance 1). Further, the update step can 
be done in the simplest case using Euler discretization: 

Yj,t d = htd-! + y;(T rf _i)AT rf (43) 

and 



h^d h*d- 



_! +&(yj I x \ d _^ ^-i)y/fe-i)Arrf. (44) 

Of course, better discretization schemes are possible. As 
we have already mentioned, if the antiderivative Gj with 
Gj(fo) = 0 (which is in fact the distribution function) of yj 
is available, then one should rather use 

Yj,Td = G j( T d) 

for the computation of the values yj iXd . 
Adaptive stepsize 

To be able to fully exploit our MTU particle filter method, 
the discretization stepsize must be chosen appropriately. 
One simple possibility is to use a very small stepsize 
throughout the complete procedure. A quite high compu- 
tation time will result from that. This can be reduced if 
an adaptive stepsize is chosen. We propose to determine 
the stepsize Ar^ online depending on the ESS estimate. 
The stepsize should decrease when the ESS drops rapidly, 
and it should increase again if the ESS estimate changes 
only marginally. In detail, the following procedure can be 
applied. 

In each step of the algorithm, we obtain an initial guess 
of the stepsize by a linear interpolation between a maximal 
stepsize if the ESS had not changed since the last step, and 
a minimal stepsize if the ESS had dropped by the number 
N of samples (actually, the maximal difference that can be 
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Algorithm 2 MTU particle filter 

l: {At time To = to:} 

2: Sample N state realizations x l TQ , i = 1, . . . , N, of X TQ , with N large. 

3: for; = l,...,Mdo 

4: Set y 7 ; To = 0. 

5: end for 

6: for / = 1, . . . ,N do 

7: Set the weight = Q to (4 0 ) . 

8: for; = 1, ... ,M do 

9: Setu>'. =0. 

Mo 

10: end for 

li: Set the cumulative product v l of the selection weights to 1. 

12: end for 

13: for all times r^, d = 1, . . . , D: do 

14: {Resample the particle set (V rrfl );=i,...,Af if necessary (e.g. if the ESS drops below a threshold):} 

15: Generate a selection function i according to some selection weights v\ d l ♦ 

16: for i = 1, . . . ,N do 

17: Replace the state sample x\ by the selected state samples x^_ x ♦ 

18: for; = 1, ...,Af do 

19: Replace the partial weight w l - Xd i by the selected partial weight ^ 

20: end for 

21: Replace the cumulative product of the selection weights V by v L ^ Vx d _ x . 

22: end for 

23: for i = 1, ... ,N do 

24: Sample (at least approximately) a realization from the Markov kernel 

25: for; = 1, ...,Mdo 

26: Compute (at least approximately) 

Yj,x d = Yhx d _ x + / 7/(0 d£ or (if available) Yj,x d =\ Yj(t)dt 

27: and 

= + f " &(yj I 4 df. 

28: end for 

29: Set the total unnormalized weight to 

< = \i n l 1 - n« + <r d ) ■ 

7=1 

30: end for 

31: For given functions /z, compute approximations to the expectations of the filtered process: 

J2f-i <MA d ) 

E Td [h(X Td ) | Y 1:M =y 1:M ] « . 
32: end for 
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obtained is N — 1). From this initial guess, we compute the 
increments of the partial weights, and from them we pre- 
dict the ESS in the next step based on the current stepsize 
guess. If the difference between this predicted ESS and the 
current ESS drops by more than a certain relative amount 
(we use 10%), then a new guess of the stepsize is com- 
puted by dividing the current guess by 2. With this new 
guess, a new predicted ESS is computed, and the test can 
be applied again. This procedure will be applied iteratively 
until either the difference between predicted and current 
ESS drops by less than the prescribed amount, or the 
stepsize guess falls below a prescribed minimal stepsize. 
The current stepsize or the minimal stepsize, respectively, 
is then accepted, and the algorithm proceeds with this 
stepsize. See algorithm 3 for a formal description of the 
stepsize determination. 
If we sample from the Markov kernels of X^oo) directly, 

then Q T *\r d -i (x l T * = 1 an d the update of the 

weights does not depend on the new states x\ * . Hence we 
only need to compute the weight update and the corre- 
sponding ESS estimate until we find an adequate stepsize. 
In this case, it is not necessary to sample the new states in 
each iteration which renders the algorithm computation- 
ally more effective. 

Note that this procedure cannot be performed when the 
measurement times are fixed (i.e. in the standard parti- 
cle filter). In this case the ESS does not depend on the 
stepsize, and a reduction of it will not improve the ESS. 
The application of the MTU particle filter with distributed 
measurement times is essential to be able to use this 
adaptive stepsize procedure. 

Data Likelihood 

As mentioned earlier for the standard case of the particle 
filter algorithm, without resampling, the data likelihood 
could be approximated by the empirical mean of the 
unnormalized weights (see (19)). In our case, this would 
be 



i—l i=l 



After each resampling step, we have to correct this for- 
mula. Using the same notation as in the paragraph on 
resampling, for each time t > S£ and for each particle /, 
the corrected estimate is given by 



U=l i=l / i=l 

( si,...,s t ;i\ 
\ X [to:t] ) 



N \ N 



(45) 



nix e 



Algorithm 3 Determination of the adaptive stepsize 
1: Compute an initial guess of the stepsize: 

|WESS,^_ 2 - «ESS,r d _il 



At} — Ar max — (Ar max — Ar m i n ) • 



N 



2: Set r* = r^_i + ArJ. 

3: 
4: 
5 



loop 

for i = 1, . . . , N do 

Sample (at least approximately) a realization x l * 

T d 

from the Markov kernel 



6: for j = 1, . . . , M do 

7: Update the weights: 



and 



8: end for 

9: Set the total unnormalized weights to 



U = +g > ( y > 1 ^-0 yj^-^ Ar d- 

for 

the total unnormalize 



n(i-^+^). 



10: end for 

11: Compute the tentative ESS estimate: 
1 

^ESS,t 



where 



a 

are the normalized weights. 

12 . if |WESS ^-l- W ESS,r*l > ai ^ d > A 



Set ArJ to max (^f-, AT m i n j and repeat the loop, 
else 

Set Ar^ = ArJ and Xd = ^ an< ^ Drea ^- 
end if 
end loop 



with 

(see (41)). This can be seen by considering recursively the 
correction at the time of the resampling step L Since 



MO 



y N 1 v v 

L^v=\ v si 

is the expected number of times the particle i has been 
selected after N draws, we have to correct the weights 
^i,...,s£-i - n se | ect j on ste p £ j-jy dividing them by 
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this number. The following computation then proves the 
correctness of the formula via induction: 




U=l i=l / i=l 

We get algorithm 4 for the computation of the empirical 
estimate of the data likelihood, which needs to be done in 
parallel to the MTU particle filter algorithm. 

Offline and online estimation 

Two main cases of estimation procedures may be distin- 
guished: the offline estimation procedure used for exam- 
ple for parameter estimation, and the online estimation 
procedure for control purposes. While in the offline case 
the measurements are completely available before estima- 
tion begins, we know only some of the measurement val- 
ues at some certain time t during the online procedure. In 
this latter case, an online computation where all measure- 
ment times have been modelled with probability densities 
with infinite support is impossible, because in this situ- 
ation all measurement values must already be known at 
time to. Online estimation is nevertheless possible if the 



Algorithm 4 Estimation of the data likelihood 

1: At time xq = to: 

2: Set the correction factor for the data likelihood to Z To : = 
1/N. 

3: for all times r^, d = 1, . . . , D do 

4: if the particle set has been resampled then 

5: Set the correction factor for the data likelihood to 




i=l 



6: else 

7: Set 

8: end if 

9: Compute the empirical estimate for the data likelihood 
at time by 

N 
i=l 

10: end for 



support of the measurement time densities is finite. The 
online estimation must then be delayed by the diameter of 
the respective supports. 

Implementation 

We have implemented the proposed algorithm in Math- 
ematica as part of a Parameter Estimation Toolbox for 
Systems Biology developed by the Systems Biology group 
at the Fraunhofer Chalmers Centre (FCC) in Goteborg 
(Sweden). Furthermore, we have implemented it also in 
the statistical computing language R [28]. All figures in 
this article have been created using this R implementation. 

Results and Discussion 

Motivating example - results 

In this section, we resume our motivating example and use 
it in a parameter estimation setting to compare our MTU 
filter to both the standard particle filter and to a state-of- 
the-art Maximum Likelihood (ML) method which is not 
based on Monte Carlo techniques. To this aim, we first 
create virtual "measurements" at four intended measure- 
ment times ijyj = 1, . . . , 4 by simulation runs with our true 
model (see Table 1). This is done as follows: We simulate a 
single state path (q(t))te[o,io] based on the "true" parame- 
ter values a = 1 and /3 = 3. Then, for each intended mea- 
surement time tj, we sample an actual measurement time 
tj according to the density yj. Now, for each tj, we sample a 
measurement value yj from the distribution N(q(tj), Oy). 
Figure 2 shows one set of measurement values obtained in 
this way. The intended measurement times are 0.5, 1, 2, 4 
and the measurement values we got from one simula- 
tion run are 1.083346, 2.550290, 2.700863, 2.949450. We 
will use these values in the following estimation runs. Our 
aim is to estimate the parameter vector 0 = (a,/3) with 
the following three differrent estimation procedures and 
compare the results: 

1. ML estimation on "lumped" measurements, 

2. Standard PF on "lumped" measurements, and 

3. MTU-PF. 

Note that in both the standard PF and the ML estimation 
method, it is not possible to include the uncertainties in 
the measurement times directly. Rather, we have to lump 
these uncertainties with the uncertainties in the measure- 
ment values by increasing their variances Oy and to treat 
the tj like true measurement times. In both cases, we 
tested several values of "lumped" variances Oy. 

We have implemented the standard particle filter and 
the MTU particle filter in the statistical computing lan- 
guage R [28]. The ML estimations have been done using 
the Parameter Estimation Toolbox for Systems Biology 
developed by the Systems Biology group at the Fraunhofer 
Chalmers Centre (FCC) in Goteborg (Sweden). 
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Figure 2 Measurement distribution with a set of possible 
measurements. The dashed dark-green line depicts the nominal 
evolution of the state q over time. The green shaded area depicts the 
region where the measurements are expected. The yellow circles 
depict a set of possible measurement values. 



NIL estimation 

The ML estimation is a standard method based on a Max- 
imum Likelihood approach, that is, on the maximization 
of the likelihood function L(0;y\ : M) with respect to the 
parameter vector 0. The computation of the likelihood 
requires the use of estimates of the state mean and covari- 
ance matrix which are routinely obtained by the use of a 
non-linear derivation of the Kalman filter. 

Given measurements y\-M at a total of M time points tj 
which are assumed to be fixed and known, the likelihood 
function can be written as 



L(0;y 1:M ) : =f YhM ' ThM '°(yi: M ;0) 



M 



7=1 



(46) 



where f Y ^ Yl '^ 1,ThM ' 0 is the conditional probability den- 
sity function of the y-th observation yj given all previous 
observations yi-j-i as given in (9), here explicitly based on 
the given parameter vector 0, As already mentioned, ana- 
lytical solutions are not generally available. An exception 
to this is the case that all measurements are condition- 
ally Gaussian, that means that all conditional densities 
jYj\Yi : j-i,Ti, M ,o are Gaussian. In this case, the Kalman filter 
yields the correct solution. The assumption that the mea- 
surements are conditionally Gaussian is commonly used 
as an approximation even in those cases where this is not 



true. If one accepts that this assumption gives an approx- 
imation which is close enough to the true model, then 
it is only necessary to propagate the means and covari- 
ances of the measurements, since the Gaussian probability 
distribution is completely characterized by the first two 
moments. We introduce the notation 

yj\j-\ = E[yj\yi:j-v>0] 

and 

%_i = Cov[yj\y hJ -i;0] 

for the prediction of the mean and covariance of the 
observation variables, respectively. The computation of 
these values can be achieved by some derivates of the 
Kalman filter, commonly used are the Extended Kalman 
Filter (EKF) or the Unscented Kalman Filter (UKF), more 
exactly those versions of these filters which are time- 
continuous in the states and time-discrete in the measure- 
ments (continuous/discrete EKF and UKF, respectively). 
The residuals e ; are then denned by the differences 
between the measurements yj and their estimations J/l/'-l' 

The assumption of normally distributed observations 
leads to an approximation of the likelihood function of 
the following form 

L(0;y 1:M ) « L(0;y 1:M ) := Y\ ^ '—^ f 

where / is the number (dimension) of the observation 
variables. The negative logarithm of this approximated 
likelihood function is 

1 M 

- \ogL(P; yi . M ) =^J2 (log(det(%_i)) + ^R^j) 

7=1 

Ml 

+ — log(27T). 

The problem to finding maximum likelihood estimates 
of the model parameters takes the form of a nonlinear 
optimization problem: 

0 = argmin{-logl(6>;ji^)}. 

<9g0 

Roughly, the estimation with this approach is done as 
follows: 

1. Choose an initial guess 6* for the parameter vector 0. 

2. Based on the current parameter vector 6* and for 
each (intended) measurement time tj, compute 
residuals ey and estimates of the covariance matrices 
Rj\j-i using (an approximation to) the Kalman filter. 

3. Compute the likelihood L based on the state 



estimates 6/ and R 



7l7-i* 
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Figure 3 Parameters a and ft estimated by ML estimator. Box plots of estimated parameters of 1 00 runs each. The medians are depicted with 
circles. The bottom and top of the box are the 0.25-quantile and the 0.75-quantile, i.e. 50% of the values lie within the box. The whiskers mark the 
0.025-quantile and the 0.975-quantile, i.e. 95% of the values lie between the whiskers. The horizontal red lines denote the true parameter values. 



4. Use one step of some local optimization technique to 
find an improved parameter vector #* ew that 
decreases the negative log-likelihood function; 
replace 6* by this improved parameter vector 0* ew . 

5. Repeat steps 2 to 4 until the parameter vector shows 
convergence. 

The results of several runs of parameter estimations for 
different values of the lumped measurement variances Oy 
are shown in Figure 3. For each <x^, we performed 100 
runs with different initial state and parameter values. Note 
that the choice of an initial parameter value is mandatory 
for all approaches based on a local optimization method, 
and that different choices may lead to different results. On 
the other hand, in our example, we assume a log-normal 
prior distribution for the initial states (see Table 1), while 
the Kalman filter based approximations always assume 
normally distributed initial state values. Conditioned on 
the intial state (and current parameter), our example sys- 
tem is Gaussian linear which means that the Kalman filter 
yields exact estimates. We therefore decided the use the 
following procedure: Before each run, we sample the start- 
ing values for the parameters from the same priors as are 
available to the particle filter (see Table 2), and we also 
sample initial values qo for the states from the correct dis- 
tributions (see Table 1). Then we start the estimation pro- 
cedure based on these initial parameter and state samples, 
and we keep the resulting estimated parameters as one 
"sample". We repeat this procedure for all of the 100 runs. 
In this way, the standard procedure uses exact the same 
information as the particle filters, with the only exception 
of the measurement time distributions yj. Figure 3 shows 
the distributions of the estimated parameter values as box 
plots. 

The estimated parameters of all estimation runs are far 
away from the true values; even quite large variances Oy do 
not improve the performance of the estimations. (We also 
tested the ML approach on a different data set of 4 mea- 
surements which was created without randomness in the 



measurement times: the ML approach performed much 
better in this case since the estimated parameters varied 
around the true values; data not shown). 

Estimation with particle filters 

Tests with the particle filters (both standard and MTU) 
have been performed in the following way: First we per- 
form an estimation run where we use these measurements 
to estimate the parameters a and /3 of the system equation 
dq(t) = (~aq(t) + p) dt + o d W t . We perform this esti- 
mation with the MTU particle filter and with the standard 
particle filter under the same conditions and with the 
same seed for the random number generator. Thus the ini- 
tial distribution of the particles is the same in both cases. 
For the standard filter, we use different levels of variance 
Gy for the measurement noise. Afterwards, the empiri- 
cal medians of the final parameter distributions are used 
to perform a simulation run in each case, where we esti- 
mate the state q of the system based on the measurements 
and these fixed parameters. For both runs, we compute an 
estimate of the data likelihood over time. The choices of 
stepsizes and artificial parameter dynamics can be found 
in Table 2 (the values for the artificial parameter dynam- 
ics are computed by interpolation in the same way as we 
will describe later in the application of the MTU-PF to the 
plasma-leucine model). 

Table 3 shows a comparison of the estimated parameters 
and data log-likelihood values. The estimated parameters 



Table 2 Choices for the PFs for the motivating example 



prior for estimation of a 


log-A/*(log(2),1 2 ) 


prior for estimation of p 


log-A/*(log(6),1 2 ) 


stepsize for standard particle filter 


io- 2 


maximal stepsize for MTU particle filter 


10- 2 


minimal stepsize for MTU particle filter 


icr 6 


diffusion parameter for artificial parameter 


5.43/(t + 3.29) 2 


noise at time t 
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Table 3 Estimated parameters and data log-likelihood values for the MTU particle filter and the standard particle filter 
with different standard deviations ay for the measurement noise 





ay 




ft 


Data log-likelihood 

F^tiinatinn run 


Data log-likelihood 

^imiilatinn run 

•Jl 1 1 1 \A Id 11 KJi 1 IUII 


MTU 


0.005 


1.012 


3.010 


-4.327 


2.398 


standard 


0.005 


7.031 


20.712 


-24.084 


-744.970 


standard 


0.01 


9.102 


26.919 


-23.936 


-890.746 


standard 


0.1 


4.709 


13.847 


-9.081 


-140.117 


standard 


0.25 


1.425 


4.171 


-6.714 


-4.618 


standard 


0.5 


1.156 


3.287 


-5.538 


-2.170 


standard 


0.75 


1.318 


3.604 


-5.807 


-3.160 


standard 


1 


1.450 


3.733 


-6.227 


-4.100 



The true parameters are a = 1 and p = 3. 



(empirical medians) of the MTU particle filter (a = 1.012 
and /3 = 3.010) are very close to the true values (a = 1 and 
p = 3) whereas the estimated values of the standard parti- 
cle filter are significantly worse. The data log-likelihood in 
both estimation and simulation run is considerably higher 
for the MTU filter, compared to any of the standard filter 
runs. 

In Figure 4, we show the distributions of the filtered 
states obtained by simulation runs with the estimated 
parameters. The violet shaded area in Figure 4(a) depicts 
the estimated state distributions of the state q for the 
MTU particle filter with the light-blue line marking its 
median. The dashed dark-green line depicts the nominal 
evolution of the state q over time which should be approxi- 
mated by the filter. Figures 4(b-d) show the estimated state 
distributions of the state q for the standard particle filter 
with different assumed lumped measurement variances. 
As can be seen from these figures, the approximation by 
the MTU particle filter is very close to the evolution of 
the state with the true parameters. On the contrary, how- 
ever we choose the measurement variance in the standard 
case, the algorithm is not able to adequately approximate 
the correct state evolution. 

Figure 5 displays the simulated measurement distri- 
butions corresponding to the filtered state distributions 
based on the simulation runs with estimated parameters. 
Here we notice again that the MTU particle filter can 
adapt well to the situation, whereas the measurement dis- 
tributions which can be realized by the standard filter do 
not fit well with the data points. 

In addition to the significant improvement in the 
parameter and state estimation, the MTU particle filter 
has another benefit. Figure 6 shows the development of 
the effective sample size estimate during the estimation 
runs. In the standard particle filter runs where the lumped 
measurement variance is assumed to be small, the pre- 
dicted values for states and parameters do not fit well with 



the measurements. At the measurement times, when the 
predicted states are compared with the measurements and 
weighted accordingly, most weights decrease rapidly. This 
leads to a high variance of the weights and the effective 
sample size estimate drops severely, indicating that only 
very few particles effectively contribute to the estimation. 
This degeneration of the particle cloud is prevented by the 
MTU particle filter, as can be seen in Figure 6(a). For the 
standard filter runs where Oy is assumed to be larger, the 
ESS estimate does not drop as severely as with a small oy 
This is due to the fact that, with a measurement variance 
which is assumed to be very high, more or less all pos- 
sible simulated measurements are considered to fit well 
with the true measurements. However, these niters are not 
able to establish reasonably good estimates of the states 
and parameters and are clearly outperformed by the MTU 
particle filter. 

Application to the Plasma-Leucine Model with Population 
Data 

In this section, we apply our MTU particle filter algorithm 
to a leucine kinetics model (Demant et al. [29] based on 
Cobelli et al. [30]) with data taken from a clinical study on 
diabetes patients ([31,32], see Additional file 1). We per- 
form Bayesian population-based (i.e. NLME) parameter 
estimation with this model. The model and the data have 
been previously used in a maximum likelihood estimation 
context in [17]. It should be noted that the original ODE 
model needs to be extended by some kind of stochas- 
tic process variability in order to turn it into an SDE 
model. The approach taken in [17] is different from our 
approach in the way that in [17] the stochastic fluctuations 
are assumed to be in the tracee (plasma leucine) while we 
here assume the variability to be in the tracer (labelled 
leucine). Both assumptions are plausible; a final decision 
on the best way to model the process dynamics has not yet 
been made. 
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Figure 4 Filtered state distributions based on simulation runs with estimated parameters for the motivating example. MTU particle filter 
(a) and standard particle filter with several different assumed lumped measurement variances ay (b-d). Violet shaded area: Filtered state distributions 
based on empirical quantiles. Solid blue line: Median of filtered states. Dashed dark-green line: nominal evolution of the state q over time. 



The Leucine model 

In [31] (see also the thesis [33]), a new combined 
multicompartmental model for apolipoprotein B-100 
(apoB) and triglyceride metabolism in very low den- 
sity lipoprotein (VLDL) subfractions has been developed, 
see Figure 7. VLDL are transporters of triglycerides and 
cholesterol from the liver to the periphery, and elevated 
levels are associated with increased risk for cardiovas- 
cular events. Each VLDL particle has exactly one apoB 
molecule attached which makes apoB a suitable marker 
for triglyceride transport. 

The secreted particles become denser and denser as 
triglycerides are delivered to target organs such as mus- 
cles and adipose tissue and the relative protein content is 
increased. As the density increases, the VLDL becomes an 
intermediate density lipoprotein (IDL) and finally a low 
density lipoprotein (LDL). 



For our purposes we use only the part of the model con- 
cerning the leucine pool (compartments 1-4), see Figure 8. 
This subsystem was first used for apoB kinetic studies by 
Demant et al. [29] as an implementation of the work by 
Cobelli et al. [30]. The output is at compartment 1 and 
at compartment 2. The influx into compartment 1 will be 
denoted U\. 

The data are obtained from tracer/tracee experiments. 
Here the tracee (i.e. the concentration we are actually 
interested in) is the leucine amino acids in the apoB 
molecules. Additional labelled leucine (tracer) is injected 
as a bolus infusion. Knowledge about the kinetics (fluxes 
between compartments) of the tracee can be gained by 
studying the kinetics of the tracer. In the restricted model 
four compartments are considered: plasma leucine (1), 
intra-hepatic leucine (2), and two plasma protein pools 
(3 and 4). In the full model, additional compartments 
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Figure 5 Simulated measurements corresponding to the filtered state distributions for the motivating example. MTU particle filter (a) and 
standard particle filter with several different assumed lumped measurement variances oy (b-d). Violet shaded area: Measurement distributions 
based on empirical quantiles. Solid blue line: Median of simulated measurements. Yellow circles: Measurements. 



represent VLDL subfractions (compartments 5-11 in 
Figure 7). The four-compartment system is linked to com- 
partments 5-11 through compartment 2. 

For each compartment i where i = 1, . . . , 4, let Q; and qi 
denote the mass of the tracee and the tracer, respectively. 
Similarly, let U{ and U{ denote the input for the tracee and 
the tracer, respectively. For tracer/tracee experiments, Q; 
is assumed to be in steady state. If the concentration level 
of the labelled injection is small compared with the over- 
all concentration levels, and if the model is linear, then 
approximately 



dq(t) 
dt 



: K(t)q(t) + u(i) 



where q(t) = (^(0)^=1,2,3,4* u ^ = (^i(0> 0, 0, 0) T and 
Kit) = (k jti )j 7 i=l,22A 



where kjj for i ^ j is the transfer coefficient of the tracers 
from compartment i to compartment ; (compartments 0 
and 11 are considered to be output compartments), and 
for each i = 1, . . . , 4 



hi := 



E hi- 



7=0,1,2,3,4,11 



Throughout this paper, the time unit used is hours, all 
fractional transfer coefficients are given in the unit h _1 , 
and the amount of material in compartments is given in 
mg. In our model, only /<o,i, /ol,2> ^1,3 > fe,i> fe,i> ^3,4> ^4,3 
and kn } 2 are assumed to be non-zero, while additionally 
the following dependencies of the transfer coefficients are 
assumed to be valid: 

h,2 = k 2 ,i 

/<3, 4 = 0.1 • /< 4 ,3 
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Figure 6 Estimated effectice sample size during estimation run. MTU particle filter (a) and standard particle filter with several different assumed 
lumped measurement variances a y (b-d). The dotted blue line denotes the resampling threshold. 



The fractional transfer coefficient kn t 2 has to be fixed 
for the system to be identifiable. We set k\\ t 2 = 
O.Olh -1 , as an estimated average from current results. 
We build stochastic differential equations (SDEs) from 
the resulting ordinary differential equations by adding 
noise terms which are given by standard Wiener pro- 
cesses Wi,t, . . . , W^t multiplied by diffusion parameters 
<7i, . . . , 04, respectively. The leucine pool subsystem which 
we consider here (compartments 1-4) interacts with the 
surroundings only via an initial input into compartment 1 
at time t = 0, an output from compartment 1, and another 
output from compartment 2 (towards compartment 11). 
All other flows are processes acting inside the subsystem 
and hence should follow the principle of mass conserva- 
tion. Therefore we add the stochastic terms to the ODE 
system in the following way: 



dqi (t) = - (k h2 + k 0) i + /< 3 ,i) (qi (f) dt + a x dWi (£)) 

+ h, 2 (q 2 if) dt + a 2 dW 2 (f)) 

+ k lf3 (qs (t) dt + a 3 dW 3 (£)), 

dq 2 (t) = - (/<ii, 2 + k 1}2 ) (q 2 (t) dt + a 2 dW 2 (£)) 

+ ha (qi(t)dt + aidWi(t)), 



dq 3 (t) = 



dq*(t) 



(ki, 3 + h,3) (qs (t) dt + as dW 3 (£)) 

+ 0.1/c 4 ,3 (q^t) dt + cr 4 dW 4 (0) 

+ k 3 ,i (qi(t)dt + cr x dWi(t)) t 

- 0.1*4,3 fait) dt + <t 4 dW 4 (£)) 

+ /c 4 ,3 (q3(t)dt + cr 3 dW 3 (t)). 



We fix the diffusion parameters to be o\ = 02 = 03 = 
(J4 = 3. Initial conditions are given by 

qi(0) = q 2 (P) = q 3 (0) = q^O) = 0. 

The patients get a bolus injection and therefore the input 
u\ (t) will be modelled as a delta distribution at time t = 
Oh, 

ui(t) = u 1>0 8(t). 

Practically, this means that only the initial condition q\ is 
affected by it, and we can replace the initial condition for 

qi b y 

qi(0) = 1*1,0, 
and set u\ (t) = 0 in the differential equations. 
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Figure 7 Multicompartmental model for apolipoprotein B-100 (apoB) and triglyceride (TG) metabolism in very low density lipoprotein 
(VLDL) subtractions. This multicompartmental model was developed in [31]. Circles depict compartments and arrows depict fluxes between 
compartments. The model includes separate modules for leucine (yellow) and glycerol (red). The free leucine plasma kinetics is modeled by two 
pools (3 and 4) and a plasma compartment (1 ), which interchange materials with an intrahepatic compartment (2). Compartment 2 feeds the apoB 
synthetic machinery. For glycerol, the plasma compartment (1 3) is connected to a pooling compartment (1 2) and feeds TG synthesis, which consists 
of a fast pathway (1 4) and a slow pathway (21 ). The assembly of lipoprotein is modeled by separate delays for apoB (1 1 ) and TG (22). The plasma 
kinetics of apoB and TG is modeled by a four-compartment hydrolysis chain, consisting of compartments 5, 6, 8, and 1 0 for apoB and compartments 
1 5, 1 6, 1 8, and 20 for TG. Compartments 5/1 5 and 6/1 6 are associated with VLDL1 , together with a slowly decaying compartment 7/1 7. 
Compartments 8/1 8 and 1 0/20 together with the slowly decaying compartment 9/1 9 form the VLDL2 module. Lipolysis of TG is modeled to take 
place in the transfer between two compartments. For details on the full model, see [31 ,34], or [33]. For our purposes, we use a restricted model 
consisting of the leucine pool, e.g. compartments 1 to 4, see also Figure 8. 



The same differential equations, without noise terms, 
are assumed to be valid for the states Qi and the input U\ 
of the tracee: 



dQ(t) 
dt 



K(t)Q(t) + U(t) 



where Q(t) = (Qi(t))] =12 3A , U(t) = (£/i(f), 0, 0, 0) T . It 
is assumed that the tracee input U\(t) = U\ is constant 



kn,2 




Figure 8 Schematic depiction of the restricted model (leucine 
pool) [33]. This scheme is a subscheme of Figure 7. Circles depict 
compartments. Arrows depict fluxes between compartments and are 
labelled with the corresponding fractional transfer coefficients. 
Compartment 1 is the plasma-leucine compartment where the 
leucine is injected. Compartment 2 is an intrahepatic compartment 
which is the source for apoB synthesis. Compartments 3 and 4 are 
body protein pools. The output is at compartment 1 . Compartment 
1 1 is a delay compartment used only as output from compartment 2. 



but unknown. We will therefore estimate it together with 
the transfer parameters. Since for the tracee steady state is 
assumed (i.e. dQ;(£)/ dt = 0), it is possible to solve those 
equations for Qi(t), and we get: 



Qi(f) 



(*ll,2 + *l,2)tfl 



*0,l(*ll,2 + fa) + £11,2*1,2 



The output is a measurement proportional to the ratio 
between the tracer and the tracee, disturbed by log- 
normal noise: 



- Log- M (0, Gy x ^ independently 
for each t, 



where we assume the value of the variance parameter to 
be Gy x — 0.5 2 (this is the variance of log i= t ). The parame- 
ter pi denotes the unknown proportion of plasma leucine 
that actually is in the plasma. The parameters p\ and U\ 
are not jointly identifiable, therefore we fix p\ = 0.65. 
More details concerning the deterministic model (without 
noise terms) can be found in [31] and [33]. Note that the 
stochastic disturbances are not part of the original model, 
they are rather augmentations of the model used in this 
article. 

The mixed effects model 

The model as presented in the last paragraph contains 
only flux parameters kjj which are assumed to be the 
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same for every individual Neither does it account for indi- 
vidual differences between several persons, nor does it 
account for possible changes of the flux parameters when 
the persons under consideration are affected by a disease 
or a treatment. To be able to treat these differences in an 
appropriate way, we introduce group and patient specific 
parameters in the model; namely the transfer coefficient 
/<o,i will be split into a group dependent and a patient 
dependent part. In this way, we introduce so-called mixed 
effects into the model. Mixed effects generally increase the 
difficulty in inference making. In the following test runs, 
we will use measurement data previously reported as indi- 
vidual data for a total of 34 subjects [31,32]. Among them, 
15 patients belonged to the diabetes group, and 19 to 
the control group. From experiments, it can be observed 
that the degradation rate /<o,i of plasma leucine is signifi- 
cantly different for people with and without diabetes. We 
therefore assume that the expected value of /<o,i in each 
group differs and may be either /cq x or I<q x corresponding 
to the diabetes group and the control group, respectively. 
Additionally, we assume that we have patient-dependent 
random factors £ p modelling the parametric uncertainties 
among individuals, such that finally 

(p) £/^o,i if tne P at ient p is in the diabetes group, 
I £pko,i if tne P at ient p is in the control group 
where all £ p 's are static and independently log-normally 
distributed: 

$ p = exp(^) with ri p ~Af (o, afy 

independently for each p 

for p = 1, . . . , 34. As a consequence, each of the states 
#i,...,#4 has to be considered separately for each 
patient p. We indicate this by writing qf\...,qf\ 
p = l,...,34. 

The aim of the estimation runs is, apart from estimating 
the remaining parameters, to show that the group depen- 
dent parameters I<q V I<q X are indeed different. We want 
to apply Bayesian estimation to the parameters. For this 
reason, we principally have to treat them like state vari- 
ables in the particle filter. Since the estimation of static 
variables is problematic with particle filter methods, it is 
standard to introduce small artificial stochastic dynamics 
to the parameters consisting of normal increments with 
decaying variances [35]. The same is true for the static 
noise parameters rj p which also are to be estimated. Our 
process X t is then given as an augmented state vector 

x t = (^V),/^ 

The overall model is thus a non-linear mixed effects model 
with three levels of effects (parameters), namely global 
parameters, group dependent parameters (I<q V /<oi)> an d 
personal parameters (l; p ). Nevertheless, since the core of 



the model is linear, i.e. the states q\, . . . , #4 conditioned on 
all parameters, a Rao-Blackwellization concerning the lin- 
ear parts of the model is possible, and the Kalman filter 
applied to this linear partial model can be used in combi- 
nation with particle filtering for the non-linear parts [36] . 
Since the model is used for demonstration purposes only, 
we did not use this technique although in principle it is 
possible. 

Estimation runs 

Estimation and simulation runs have been performed with 
data from all 34 patients (19 from control group and 15 
from diabetes group). The computer experiments have 
been done as follows. We first estimate parameters with 
the MTU particle filter. Separately, we estimate param- 
eters with the standard particle filter under the same 
conditions and with the same seed for the random num- 
ber generator. The initial distribution of the particles is 
then the same in both cases. For both runs, we compute 
an estimate of the effective sample size (ESS) and the data 
likelihood over time. Both estimates allow a performance 
comparison of the MTU versus the standard particle fil- 
ter. Secondly, the empirical medians of the final parameter 
distributions are afterwards used to perform simulation 
runs in both cases. Both versions of the particle filter, this 
time with parameters fixed to the estimated values, are 
used to perform these simulations. In these simulation 
runs, the data are used for the computation of the data 
likelihood conditioned on the estimated parameters. The 
resulting distributions of the simulated measurements can 
be compared to the true measurements, both visually and 
by observing the data likelihood. 

We used 10,000 particles and a resampling threshold of 
7,500. Stepsizes in the MTU filter are between 10 _7 h and 
10 _3 h, adaptively computed based on the ESS estimate. 
In the standard filter, we use a fixed stepsize of 10 _3 h. 
The data contain measurements until time t = 8h, but 
we perform estimations and simulations only until time 
t = lh, mainly to reduce computation time; anyhow, after 
t = lh, the tracer concentrations are relatively low and do 
not change considerably, and therefore the measurements 
are not expected to improve the estimations significantly. 
In our implementation, we directly sample from the states 
X t (i.e. X[t 0)O o) = ^[£ 0 ,oo) in law) using the Euler-Maruyama 
scheme for discretization [37]. 

As mentioned earlier, Bayesian parameter estimation 
with particle filters requires the introduction of artificial 
dynamic noise for the parameters. It is standard to use 
normal increments with decaying variances [35]. Since 
in our case all parameters with exception of the rj p s are 
assumed to be positive, the application of a "log-normal" 
noise (based on a geometric Brownian motion) in place 
of the standard normal noise seems to be more appro- 
priate for these parameters. In detail, it has been done as 
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follows. The priors and the distributions of the artificial 
noise are chosen to be log-normal for all parameters with 
exception of the individual parameters rj p which have nor- 
mal priors and noise. The prior for the parameters k% v 
I<q >v ki,2, /<i,3, h,i, /< 4 ,3 is Log-A/"(0, l 2 ), the prior for U\ is 
Log-A/XlogClOO), l 2 ). The prior for each rj p is A/"(0,0.5 2 ). 
The parameter update ("artificial noise") is generally per- 
formed according to 

dO it) =6(t)o e (t)&W e t 

i> k\ } 2, Ari,3, A:3,i, /c4,3, U\, and according to 

dO (t) =(T 0 (t)dW? 

for 0 = rj p . It is standard to decrease the variance of the 
artificial noise over time. In our case we have chosen the 
diffusion parameter oq of each artificial noise variable to 
be dependent on time via a quadratic function 

o 0 (t) =a 0 /(t-b e ) 2 

with parameter dependent coefficients clq and be* Prac- 
tically, ao and be have been determined by fixing two 
interpolation points (to, ere (to)) and (t\, cfe(h)). It holds: 

be = (h - t 0 )/ (l - y/ cr e (to)/ ere (h)) and 
ae = ere (to) (to - be) 2 . 



We found best performance with the following choices: 
For all parameters, we have chosen to = Oh, t\ = 2h and 
oe (t\) = ere (to) / 10 which means that the diffusion param- 
eter has dropped to 10% of its initial value at time t — 2h. 
The initial values are oe (to) = 0.5h for all parameters with 
exception of the rj p 's which have higher initial diffusion 
cr e (t 0 ) = lh. 

As mentioned earlier, we have performed two different 
estimation and simulation runs, one with the MTU par- 
ticle filter with distributed measurement times, and for 
comparison one run with the standard particle filter. In 
the MTU particle filter, the distributions of the measure- 
ment times are truncated normal distributions with mean 
equal to the nominal value of the measurement time and 
with variance 0.001 2 . The normal distribution is truncated 
at the time point 0.0 lh left and right from the mean value. 
In Figure 9, the development of the estimated effective 
sample size and the estimated data likelihood is shown, 
both with respect to time t. The development of predic- 
tions of the measurements during a simulation run with 
the final estimated parameters is shown in Figures 10 
and 11 for the run with the standard particle filter, and 
in Figures 12 and 13 for the run with the MTU par- 
ticle filter. Finally, in Figures 14 and 15, box plots of 
the final estimated global and individual parameters are 
shown, respectively. 
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Time (hours) 



final data log-likelihood: 136.207 



0.4 0.6 0.8 
Time (hours) 



MTU particle filter 



■wm i ■ 



minimal ESS: 7032.661 



I I 

0 0.2 0.4 0.6 0.8 

Time (hours) 

Figure 9 Development of estimated effective sample size and estimated data likelihood during estimation runs. Standard particle filter 
(top) and MTU particle filter (bottom). 
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Figure 10 Predicted measurement distribution overtime (standard particle filter) during simulation run. Patients from diabetes group. 

Development of predicted measurement distributions over time during simulation run with parameters estimated by the standard particle filter. 
Circles: Measurements. Solid line: Median of simulated measurements. Violet shaded area: Measurement distributions based on empirical quantiles. 



A comparison of the results of MTU-PF and standard 
particle filter shows that both algorithms exhibit very 
similar performance with respect to the quality of the esti- 
mated parameters, since in both cases the development 
of the data likelihood is very similar both for estima- 
tion and simulation. The estimated log-likelihood of the 
data at the final time is 137.239 for the MTU particle 
filter and 136.207 in the standard case, which is prac- 
tically equal. The computation time of the MTU-PF is 



only slightly higher than the one of the standard filter. 
Visual inspection of the simulation runs shows that the 
simulated measurements of the model with parameters 
estimated by both filters fit the data in a similar way. 
This impression is supported by the values of the esti- 
mated data likelihood. The MTU particle filter has a final 
log-likelihood value of 157.622, similar to the one of the 
standard filter with a value of 155.952. The difference 
is insignificant. 
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Figure 1 1 Predicted measurement distribution over time (standard particle filter) during simulation run. Patients from control group. 

Development of predicted measurement distributions over time during simulation run with parameters estimated by the standard particle filter. 
Circles: Measurements. Solid line: Median of simulated measurements. Violet shaded area: Measurement distributions based on empirical quantiles. 
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Figure 12 Predicted measurement distribution overtime (MTU particle filter) during simulation run. Patients from diabetes group. 

Development of predicted measurement distributions over time during simulation run with parameters estimated by the MTU particle filter. Circles: 
Measurements. Solid line: Median of simulated measurements. Violet shaded area: Measurement distributions based on empirical quantiles. 



In contrast to the insignificant differences concerning 
the resulting likelihoods between the MTU-PF and the 
standard PF, the development of the ESS estimate in the 
estimation runs differs remarkably. With the MTU parti- 
cle filter, the ESS estimate shows a high value at all times 
and does not drop below a value of 7032.661 during esti- 
mation. This is only slightly lower than the resampling 
threshold of 7500 (see Figure 9, upper part). The standard 
particle filter shows a much worse performance. As can be 



observed from the lower part of Figure 9, the ESS drops 
several times to very low values, reaching a minimum of 
101.102. The ESS measures in some sense the variance of 
the normalized particle weights (low ESS means high vari- 
ance) and highly varying weights indicate that the particle 
cloud is in a bad condition, since in this case a few particles 
with very high weights dominate the majority of the par- 
ticles with very low weights. The extreme case is ESS = 1 
where only one particle has a significant weight. After 
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Figure 1 3 Predicted measurement distribution over time (MTU particle filter) during simulation run. Patients from control group. 

Development of predicted measurement distributions over time during simulation run with parameters estimated by the MTU particle filter. Circles: 
Measurements. Solid line: Median of simulated measurements. Violet shaded area: Measurement distributions based on empirical quantiles. 
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Figure 14 Estimated global and group parameters. Box plots of estimated distributions for the global parameters and the group dependent 
parameters. The medians are depicted with triangles (standard particle filter) or circles (MTU particle filter). The bottom and top of the box are the 
0.25-quantile and the 0.75-quantile, i.e. 50% of the values lie within the box. The whiskers mark the 0.025-quantile and the 0.975-quantile, i.e. 95% of 
the values lie between the whiskers. The values for U] have been scaled by a factor of 0.01 in order to fit into the plot. 



a resampling step, only the information carried by this 
particle is present in the current particle cloud. Results 
obtained by estimation runs where such low ESS values 
have occurred cannot be trusted very much. If in con- 
trast a resampling step is performed while the ESS is only 
slightly below the resampling threshold, which is the case 
in the MTU-PF algorithm, then many different particles 
will be chosen during resampling and a high percentage 
of the information contained in the particle cloud will be 
carried over to subsequent steps. In this sense, the MTU 
particle filter avoids the degeneration of the particle cloud 
by controlling the value of the ESS and holding it at a 
high value at every time. As a consequence, the estimation 
results should be considered to be more reliable than with 
the standard algorithm. 

A look at the estimated values of the group parameters 
J<q x and I<q x (see Figure 14) shows that in both estimation 
runs (MTU-PF and standard PF), the rate A:q 1 for diabetes 
patients is only about 60 percent of the rate I<q x of the con- 
trol group (standard particle filter: 0.337h -1 vs 0.557h -1 , 
MTU particle filter: 0.346h _1 vs 0.577h _1 ). The good per- 
formance especially of the MTU particle filter strengthens 
the confidence in the obtained result and leads to the con- 
clusion that the secretion rate /<o,i is indeed lower for the 



diabetes patients than for the people from the control 
group. 



Conclusions and future work 

We proposed a new modification of the particle filter algo- 
rithm which works in continuous-time settings. It allows 
the direct inclusion of measurement time uncertainties 
in the underlying model. The modifications additionally 
allow the use of time-stepping strategies to improve the 
performance of the algorithm. The assumption of a ran- 
dom distribution of measurement times is natural in many 
applications. 

The MTU-PF method is generally applicable. Even when 
measurement times may be assumed to be concentrated 
on single time points, our method can be used as a kind 
of regularization of the standard particle filter method 
if artificial distributions with highly concentrated masses 
around the measurement points are introduced. 

We compared the performance of the MTU-PF to the 
standard PF and to an alternative Maximum Likelihood 
estimation method on a small artificial example. The 
results clearly show the advantage of the application of the 
MTU-PF in cases of uncertain measurement times. 
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Figure 1 5 Estimated individual parameters. Box plots of estimated distributions for the individual parameters. The medians are depicted with 
triangles (standard particle filter) or circles (MTU particle filter). The bottom and top of the box are the 0.25-quantile and the 0.75-quantile, i.e. 50% of 
the values lie within the box. The whiskers mark the 0.025-quantile and the 0.975-quantile, i.e. 95% of the values lie between the whiskers. 
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We believe that our MTU particle filter is especially 
suitable for biological/medical applications where — com- 
pared to technical applications — the variance of the 
measurement values is relatively high due to biological 
variation and because relatively few consecutive measure- 
ments are possible. We provided an illustrative application 
from a PK/PD study. A comparison of our MTU parti- 
cle filter with the standard filter showed that in this case 
our method is able to avoid weight degeneracy measured 
in terms of the Effective Sample Size (ESS) estimate. Even 
though the estimations in this application with both stan- 
dard and MTU particle filters are reasonably good, the fit 
to the measurements is still not perfect. Whether that is 
due to our choices of model and parameters, or due to 
the known weaknesses of the general SMC method (which 
also our modification necessarily suffers from), remains 
to be evaluated in greater detail. In subsequent studies, 
we plan to apply the new algorithm to the complete liver- 
plasma model using additional measurements to be able 
to draw conclusions of greater medical relevance. 

Another topic may also prove interesting for future 
work. Our experiments showed that the results are highly 
dependent on the choice of the development of the dif- 
fusion coefficients of the artificial noise necessary for 
Bayesian parameter estimation. While there is a general 
agreement that these coefficients should decrease over 
time, there is currently a lack of automated methods for 
making appropriate choices of the diffusion coefficients, 
both for initial values and dynamic development. How- 
ever, to provide a really practical method for parameter 
estimation in non-linear mixed effects models (or even 
in models which adhibit only global parameters), our 
approach could also be combined with methods better 
suited to the estimation of fixed parameters, a good can- 
didate being the PMCMC methods proposed in [11]. This 
is future work. 

In summary, we believe that the method presented in 
this article opens the door to even more efficient and 
reliable state sampling and parameter estimation meth- 
ods based on the particle filter algorithm operating on 
continuous-time stochastic state space systems. 



Notation 

(ft, A P) 
(X u B Xt ) 



probability space 
arbitrary measurable space 
(for each t e[to,oo) with 
t 0 e R) 

X t : Q — >> X t A-Bx t measurable random 

variable 

X[ to ,oc) :=(Xt)te[tb,oo) continuous-time Markov 

process with general state 
space Xfaoo) 

X[to,oo) ' = Y\t 0 <s X s state space ofX [t0fOo) 



' X [tQ,OQ) 



X[toA : - I\t 0 <s<t Xs 

^ x lt 0 ,t] 
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K s ,t(x s , dx t ) 



a(x, t) 
B(x, t) 



gj(yj\xt r tj) 

tj(j=l,...,M) 

Yjifj) 
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Kt hl ,tj(xt hl > dx tj ) 
Qtj\t h i(xtj \*tj-i) 



_ dC Xt0 (xt 0 ) 



Si 



the pushforward measure of P 
under X t , Le. Cx t (B) : = 
?{X^ l {B)) for all 5 e B Xt 
the pushforward measure of 
P under X[ tQ>oo) := (X s ) se [ tQ>oo) 
(with the corresponding 
product algebra) 
the state space restricted to 
the interval [to,t] 
the corresponding 
pushforward measure 
number of particles 
state samples at time t 
the Markov kernel of the 
process X[ t0)OO ) from time s 
to time t 
drift vector 
diffusion matrix 
multidimensional standard 
Wiener process 
measurable space 
observation random variable 
with values in measurable 
spaces (y^By.) 

conditional probability density 
with respect to a reference 
measure fiy. 

reference measure fiy. on 
(y^Byj) 

observation times 
random variables modelling 
the uncertainty about exact 
observation times 
Lebesgue measure on the 
interval [ to, oo) 
probability density of 7} 
with respect to ^[t 0) oo) 
Markov chain, pushforward 

measure, and kernel for 
importance sampling 
Radon-Nikodym derivative 

Ktj_ v tj(xtj_ v 6xtj) 

Radon-Nikodym derivative at 
start time to 

unnormalized weight of 
particle i at time t 

normalized weight of 
particle i at time t 
€-th resampling time 
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normalized selection weight 
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the l-th resampling step 
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data likelihood 


ft 




full density at time t 


ft 




filter density at time £, i.e. 
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conditional probability 
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1 otherwise 


Wj R>o 




stochastic process given by 
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expectation of h(X t ) given 






Yi'M = yi-M with respect to 
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M>o 


weight at time t 






cumulative distribution 
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< td 


time discretization 






cumulative product of the 






selection weights for particle i 


Ar^ = id — ?d-\ 




stepsize 






correction factor for the data 




likelihood at time Td 


Qi 




mass of the tracee in 






compartment i 


qi 




mass of the tracer in 






compartment i 


Ui 




input for the tracee in 






compartment i (e.g. U\ 






denotes the influx into 






compartment 1) 


Ui 




input for the tracer in 






compartment i 


hi 




transfer coefficient of the 




tracers from compartment i 






to compartment / 






diffusion parameter 






log-normal noise 






fe~Log-A/\0,a2)) 



p l degradation rate of plasma 

leucine for people in the 
diabetes group and the 
control group, respectively 
£p , rj p patient-dependent random 

factors modelling the 
parametric uncertainties 
among individuals (£ p = 
exp(r] p ) with rj p - A/*(0, cr^J) 



Additional file 



Additional file 1 : Data used for estimation and simulation. The data 
consist of one record for each patient, each record consisting of 4 lines in 
the following format: 

patient id [COMMA] group (control or diabetes) [COMMA] leucine initial 
value [NEW LINE] 

measurement 1 time [COMMA] measurement 2 time [COMMA] 
. . . [COMMA] measurement n p time [NEW LINE] 
measurement 1 value[COMMA] measurement 2 value [COMMA] 
...[COMMA] measurement n p value [NEW LINE] 
[NEW LINE] 

where n p is the number of measurements for patient p, time is given in 
hours, and each measurement value is the measured ratio of isotope 
labeled leucine. 
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